Differential Risk of SARS-CoV-2 Infection by Occupation: Evidence from the Virus Watch prospective cohort study in England and Wales

Background Workers across different occupations vary in their risk of SARS-CoV-2 infection, but the direct contribution of occupation to this relationship is unclear. This study aimed to investigate how infection risk differed across occupational groups in England and Wales up to April 2022, after adjustment for potential confounding and stratification by pandemic phase. Methods Data from 15,190 employed/self-employed participants in the Virus Watch prospective cohort study were used to generate risk ratios for virologically- or serologically-confirmed SARS-CoV-2 infection using robust Poisson regression, adjusting for socio-demographic and health-related factors and non-work public activities. We calculated attributable fractions (AF) amongst the exposed for belonging to each occupational group based on adjusted risk ratios (aRR). Results Increased risk was seen in nurses (aRR = 1.44, 1.25–1.65; AF = 30%, 20–39%), doctors (aRR = 1.33, 1.08–1.65; AF = 25%, 7–39%), carers (1.45, 1.19–1.76; AF = 31%, 16–43%), primary school teachers (aRR = 1.67, 1.42- 1.96; AF = 40%, 30–49%), secondary school teachers (aRR = 1.48, 1.26–1.72; AF = 32%, 21–42%), and teaching support occupations (aRR = 1.42, 1.23–1.64; AF = 29%, 18–39%) compared to office-based professional occupations. Differential risk was apparent in the earlier phases (Feb 2020—May 2021) and attenuated later (June—October 2021) for most groups, although teachers and teaching support workers demonstrated persistently elevated risk across waves. Conclusions Occupational differences in SARS-CoV-2 infection risk vary over time and are robust to adjustment for socio-demographic, health-related, and non-workplace activity-related potential confounders. Direct investigation into workplace factors underlying elevated risk and how these change over time is needed to inform occupational health interventions. Supplementary Information The online version contains supplementary material available at 10.1186/s12995-023-00371-9.


Introduction
Notable occupational inequalities in infection risk have emerged during the Coronavirus Disease 2019 (COVID- 19) pandemic. Research and surveillance data across various global regions have repeatedly indicated elevated risk of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in workers in various essential and/or public-facing industries, such as health and social care, transportation, education, and cleaning and service occupations [1][2][3][4][5] compared to other workers or the adult population. Occupational differences in the ability to work from home, the frequency and intensity of workplace exposure to other people, environmental features of the workspace, and the implementation of infection control procedures plausibly contribute to differential risk of infection and transmission at work [6][7][8]. However, occupation is intimately linked with other socio-demographic factors such as deprivation, household size, activities outside the workplace and health status, that can compound to influence infection risk [9,10]. Establishing the contribution of work-related exposure to occupational inequalities in infection risk consequently depends on careful consideration of other non-occupational factors.
Few estimates of the effect of occupation on SARS-CoV-2 infection risk or outcomes have comprehensively accounted for sociodemographic confounding beyond age and sex.
Age, sex, geographic factors, education, living conditions, and pre-pandemic health were estimated to account for 70-80% of the effect of occupation on COVID-19 mortality in the UK in 2020 [11]. Healthcare, care, and some service and transport occupations (among men) and elementary cleaning and plant workers (among women) demonstrated elevated mortality compared to all other occupations, but the strength of these estimates was greatly attenuated by adjustment. While these findings indicate the importance of comprehensive adjustment, mortality data are strongly affected by clinical risk factors and the impact of work-related factors on differential infection risk cannot therefore be inferred from these findings. Data from Germany [12] (February -September 2020) and Sweden [13] (January 2020 -February 2021) indicated elevated risk of infection amongst essential workers -including health, care, and service workers -compared to non-essential workers across the respective study periods, after adjustment for a range of socio-demographic factors. However, occupational differences in risk may vary by global region and comparative investigation for the UK is limited. Probability of antigen test positivity differed little across occupations after adjustment for age, sex, region, ethnicity, household composition, deprivation, ability to work from home, use of face coverings at work, and ability to socially distance at work, based on the UK Office for National Statistics (ONS) Coronavirus Infection Survey [14] between early September-early January 2021. However, the inclusion of work-related potential mediators in this analysis precludes disaggregating the impact of occupational and non-occupational factors.
Differential risk across occupations is also plausibly influenced by time, due to changes in public health interventions and restrictions-including sectoral closures, social distancing, and infection control in the workplace-as well as fluctuating levels of community transmission across the pandemic and changes in immunity due to infection or vaccination. Preliminary evidence from the UK and Norway suggests that occupational differences in infection risk vary across time, with health [3,3,[15][16][17] and social care workers [15,16] and transport workers [3] demonstrating elevated infection risk during the first pandemic wave and other public-facing occupations including education [15,16], manufacturing [15,16] and food service as well as transport workers [3] demonstrating elevated risk in the second wave. More recent data including the period of relaxation of pandemic restrictions in the UK are lacking, as are estimates over time comprehensively adjusted for non-occupational factors.
Using data from a prospective community cohort study in England and Wales (Virus Watch) [18], this study aimed to extend current understanding of the direct effect of occupation on SARS-CoV-2 infection risk over time. Specific objectives were: (1) to estimate the relative risk of SARS-CoV-2 infection by occupation across the pandemic, adjusting for socio-demographic and healthrelated factors and non-work public activities; (2) to investigate whether occupational infection risk differed across pandemic waves; and (3) to estimate the attributable fraction amongst the exposed for different occupations overall and by pandemic wave.

Participants
Participants in the current study (n = 15,190) were an adult sub-cohort of the Virus Watch longitudinal cohort study (n = 58,692 as of 12/02/2022 when cohort recruitment was completed). Participants were included in the present study if they were (1) >= 16 years, (2) in employment or self-employment and reported their occupation upon study registration, and (3) completed at least one monthly survey between November 2020 and March 2022 concerning their activities across a recent week. Further detail of the full Virus Watch cohort study, including inclusion criteria for the full cohort, can be obtained from the study protocol [18].  [19]. Occupations were then classified into the following groups, which aimed to broadly reflect workplace environments while retaining, as far as possible, ONS-defined occupational skill groupings: administrative and secretarial occupations; healthcare occupations; indoor trade and process/ plant occupations; leisure and personal service occupations; managers, directors, and senior officials; outdoor trade occupations; sales and customer service occupations; social care and community protective services; teaching education and childcare occupations; transport and mobile machine operatives; and other professional and associate occupations (broadly office-based professional and associate professional occupations).
Where possible, we also extracted more specific occupational groupings based on three-digit SOC groups for occupations within the essential worker classification [21] and classified by the investigators as public facing/ frontline roles. These more detailed occupational groups were included where group sizes exceeded n = 100 and some SOC groupings were split or combined together to reflect working environment/role, to yield the following included groupings: nurses, doctors, warehouse and process/plant occupations, food preparation and hospitality occupations, teachers (primary school), teachers (secondary school), teachers (higher education), teaching assistants and support occupations, carers, social work and welfare occupations, cleaners, and salespeople/ cashiers/shopkeepers.
For further methodological details of exposure classification and UK SOC 2020 codes within each category, please see 'Occupational Classification' in the Supplementary Materials.

Outcomes
The outcome of interest was binary SARS-CoV-2 infection status (yes/no ever infected) based on any clinical evidence of infection (positive lateral flow (LFT), polymerase chain reaction (PCR), anti-nucleocapsid antibody serological test, or anti-spike antibody serological test in absence of vaccination. Susceptibility to reinfections was not the focus of this paper; consequently, outcome data correspond to participants' first infection and follow-up was ended after first infection. Please see 'Clinical Outcomes' in the Supplementary Material for further information about clinical data in Virus Watch and how infection status was derived.
Where possible, we attributed results to the following phases of the pandemic based on test date: Wave 1 and 2 (February 2020 to May 2021) characterised by stringent public health restrictions and the dominance of the SARS-CoV-2 wild type and subsequently Alpha variant in the UK; Wave 3 (June 2021 to November 2021) characterised by relaxation of restrictions and the dominance of the Delta variant; or Wave 4 (December 2021 to April 2022) characterised by further relaxations of restrictions and the dominance of the Omicron variant. Test results were only available until 1 April 2022 due to the termination of the national testing programme in England affecting self-reported testing data and the termination of monthly serological testing in Virus Watch. Waves 1 and 2 were amalgamated into a single phase as it was not possible to attribute specific waves to serology tests conducted during Wave 2, and as mass population testing was largely introduced after the first pandemic wave in England and Wales. Both Waves 1 and 2 included periods of stringent public health restrictions, whilst Waves 3 and 4 occurred during the relaxation of public health measures in included regions, with a brief reintroduction of some limited restrictions in December 2021 and January 2022 due to the emergence of the Omicron variant. Some infections could not be attributed to a particular period as they were based on seropositivity without a prior seronegative result.
Models were adjusted for non-work public activities based on monthly surveys where participants reported the median number of days that they undertook the following activities across each survey week: using transport (using a bus, underground or overground train/ tram, taxi, or sharing a car with a non-household member), visiting essential shops, and leisure and social activities (attending the theatre, cinema, concert or sports event; eating in a restaurant, cafe or canteen; going to a bar, pub or club; going to a party; or non-essential shops or personal care services). Responses from November 2020 and February-April 2021 were allocated to Waves 1 and 2, with the second wave used to extrapolate to both early phases of the pandemic. Responses from May 2021-October 2021 were allocated to Wave 3, and from November 2021 -March 2022 to Wave 4. Monthly surveys were conducted towards the end of each month, so surveys conducted on the boundary months between pandemic waves were allocated to the subsequent wave.

Statistical Analyses
To assess the influence of occupation on SARS-CoV-2 infection risk, we performed Poisson regression with robust standard errors, an established method to estimate risk ratios for binary outcomes [23]. Separate models were conducted for the full pandemic and by wave, with the reference category set as (1) 'Other Professional and Associate occupations' and (2) the full working population of Virus Watch excluding the occupational group under consideration. Other Professional and Associate occupations were selected as a reference group following similar criteria to previous studies of occupation and COVID-19 [4,11], as this was the largest occupational group in Virus Watch, had a low absolute infection risk (see Supplementary Table S2) and was a non-frontline group with low prior estimates of exposure-relevant workplace factors [6,8].
We identified potential confounders based on a purpose-developed directed acyclic graph (DAG-see 'Directed Acyclic Graphs' in Supplementary Materials), with models presented unadjusted and fully adjusted for the following potential confounders: age, sex, ethnicity, region, deprivation and household size, vulnerability status, and non-work public activities. While adjustment for non-work public activities was not required for minimally-sufficient adjustment according to our DAG, this was included in the final adjustment set. Evidence around the relationship between occupation and nonwork public activities is lacking and any association is likely to involve complex inter-relationships with socioeconomic position and other demographic factors; however, the potential impact of non-work public activities that may occur adjacent to work (e.g., transport use) has been previously discussed in the context of occupational infection risk [24] and we consequently included this in our final model given longitudinal availability of these data in Virus Watch. Additionally, deprivation and socioeconomic position can be challenging to quantify and adjustment for non-work public activities may mitigate against residual confounding via these pathways. Vaccination status was not directly included in the models due to being a determinant of vaccination status according to UK policy, and therefore on the causal pathway between occupation and infection risk. Additionally, other factors determining vaccination according to UK policy (i.e., age, health status) were included in the models and our DAG did not indicate any open confounding pathways ( Supplementary Figs. 1a and b). No evidence of multicollinearity emerged based on variance inflation factors for any model. We performed a sensitivity analysis limited to participants who had undergone serological testing (n = 9114) to address potential differential access and testing behaviour for virological/antigen testing across occupations; it was only possible to perform this analysis on broad occupational groups across the full study period, and not for specific occupations or by wave due to limited statistical power (see 'Clinical Outcomes' in the Supplementary Materials).
Based on the fully-adjusted models, we calculated attributable fractions for the exposed subpopulations (AFs) using the punaf programme in Stata Version 16 [25]. Attributable fractions range from -8 to 1, with negative values indicating a protective effect and positive values indicating a harmful effect [25]; while negative values are often transformed to express cases prevented in the unexposed group, we did not transform estimates in order to facilitate comparison by leaving all estimates with the same denominator.
Missing data were limited for all included sociodemographic variables (0-6%) and complete cases were included in the final analyses. We conducted a missing data sensitivity analysis by applying multivariate imputation by chained equations (mice package in R Version 4.0.3 [26]) with 5 datasets with 50 iterations per dataset to socio-demographic variables and re-testing models.

Results
Participant selection is presented in Fig. 1, with demographic features of included participants (n = 15,190) reported in Table 1.

Occupational group and infection risk
Absolute risk of infection by occupational risk is reported in Supplementary Table S2, and ranged from 26% in outdoor tradespeople to 42% in teaching, education and childcare workers across the full study period.
Risk ratios comparing each occupational group with 'Other Professional and Associate' occupations are illustrated in Fig. 2 for the full study period and stratified by wave. Across the full study period, healthcare (adjusted risk ratio (aRR) = 1.29, 1.18-1.40; attributable fraction  Table S3 for adjusted AFs). When stratified by wave, between-occupational differences in infection risk were most prominent in Waves 1 and 2, and were attenuated in Waves 3 and 4 (Fig. 2). Most of the above groups demonstrated elevated risk in Waves 1 and 2along with indoor trades and process/plant occupations (aRR = 1.44, 1.18-1.76; AF = 31%, 15-43%) and sales and customer service occupations (aRR = 1.29, 1.02-1.94; AF = 22%, 2-39%). The only groups with elevated risk in later waves were teaching, education, and childcare workers -who had elevated risk across all wavesand healthcare workers who had elevated risk in Wave 4 (Fig. 2). Similar results were obtained in sensitivity analyses including only participants who underwent serological testing ( Supplementary Fig. 2), and including imputed sociodemographic data (Supplementary Fig. 3a). Across all models, adjustment for sociodemographic, healthrelated and non-workplace activities had limited effects (Fig. 2). Similar between-occupational trends were obtained when comparing each occupational group to the rest of the working population ( Fig. 3 and Supplementary Table  S2), with lower risk ratios and attributable fractions than those compared to Other Professional and Associate occupations. Similar results were also observed in related sensitivity analyses comparing each occupation to the rest of the working population based on serological testing only (Supplementary Fig. 2) and including imputed sociodemographic data (Supplementary Fig. 3b).

Specific Frontline Occupations
Absolute risk of infection for specific frontline occupations is reported in Supplementary Table S4; primary school teachers demonstrated the highest absolute risk (53%) across the full study period.

Key Findings and Interpretation
This study found persistent occupational differences in SARS-CoV-2 infection risk after comprehensive adjustment for non-work-related confounding, including sociodemographic and health-related factors and non-work social activities. Compared to Other Professional and Associate occupations-the largest occupational group in the sample with the lowest infection risk-workers in healthcare, teaching, education and childcare, social care and community protective services, and leisure and personal service occupations demonstrated elevated overall infection risk. In these groups, belonging to their occupation compared to the less risky group accounted for between 13% (for leisure and personal service workers) to 25% (for teaching, education and childcare workers) of their infection risk. Most at-risk occupations demonstrated elevated risk in the earlier pandemic phase which was later attenuated for most groups, with the exception of teaching, education and childcare occupations  for whom risk remained elevated in Waves 3 and 4, and healthcare workers who also had elevated risk in Wave 4. Where sample size was sufficient, we also investigated infection risk for specific frontline occupational groups. Occupational and temporal patterns mirrored those described above, with differences most apparent in earlier pandemic phases and persistent across all phases for teachers and teaching support workers. Substantial attributable fractions were identified for at-risk workers, with occupation accounting for between 25% (in doctors) to 40% (in primary school teachers) of workers' risk of infection. Findings may have been impacted by lack of power to detect modest effects in some groups.
Elevated infection risk in occupational groups with limited ability to work from home and those involving exposure to patients and/or the public echoes findings from the previous studies with more limited adjustment for potential confounding and from other global regions [1-5, 12, 13]. Across all analyses in the current study, adjustment for sociodemographic and health-related factors and non-work activities had limited impact on estimates. This result differs markedly to prior analysis  of occupational differences in COVID-19 mortality [11], in which adjustment for socio-demographic and health-related factors substantially reduced the effect of occupation. Occupation plausibly shapes SARS-CoV-2 exposure-and consequently infection risk-by influencing workers' ability to work from home, practise social distancing at work, work in well-ventilated environments, and access appropriate personal protective equipment. The specific mechanisms and relative contribution of different mitigating factors are likely to differ considerably by occupation, and are an important area for future research. Conversely, clinical factors that influence risk of severe morbidity and mortality once infected may differ across occupations, however the direct effect of occupation itself on severity of infection is likely to be more limited. Changing patterns of differential infection risk by pandemic phase are likely to be multifactorial. Immunityrelated factors that reduce the population of susceptible workers within a given occupation are likely to be important, and include prior infection in early phases of the pandemic, prioritization of some occupational groups (i.e. health and care workers [11,27]) for vaccination, and potential differences in the speed and overall uptake of vaccination between occupations [28]. The removal of remaining public health restrictions in Wave 3 may also have reduced differential risk by increasing overall contact rates and networks, and probability of transmission outside of work due to the increasing range of potential venues for exposure at a time of persistently high community infection rates and reduced mitigations. Resurgent risk in healthcare workers, particularly nurses, in the fourth wave may reflect the impact of relaxed restrictions on some healthcare workers with intensive patient contact as well as the impact of the immune-evasive Omicron variant. Relatedly, persistently elevated risk in teaching and childcare occupations may reflect highintensity workplace exposure in combination with high levels of infection in children [29,30]. Direct investigation into potential mediators of this phase effect was beyond the scope of this study, and is warranted to better understand the processes shaping occupational infection risk. Relatedly, investigation into effective mitigation for the ongoing elevated infection risk in teachers is recommended both to address occupational inequalities and to reduce disruption in education settings.

Strengths and Limitations
Strengths of this study include the large and diverse cohort that enabled investigation of infection risk from multiple study-derived and linked sources including both symptomatic testing and serology over multiple pandemic phases. Detailed information around participants' demographic characteristics and activities over time allowed adjustment for a comprehensive series of potential confounders, including non-work-related public activities, informed by a directed acyclic graph.
However, the study has several important limitations. The Virus Watch cohort is demographically diverse but not representative of the UK population. The study population comprised larger numbers of workers in professional and administrative occupations relative to trades and leisure and personal service occupations. Power to detect effects may consequently have been reduced in the latter groups. Occupation was classified using officiallyrecommended, semi-automatic methods [19,20]; consequently, any automatically-classified occupations that were marked as not having high certainty by the classification software were manually checked by a trained investigator to avoid misclassification [20]. Some misclassification may have remained due to undetected random error, and may have attenuated the magnitude of effects.
Potential confounders, such as deprivation, are challenging to measure and residual confounding cannot be excluded. Non-work public activities were inferred from self-reported activities across a given survey week, and may not have been an accurate reflection of participants' activity patterns across the entire relevant time period. Furthermore, social and leisure activities may have included work for some occupational groups (e.g. leisure and personal service occupations) but could not be disaggregated; however, the limited effect of adjustment in these models indicates that this was unlikely to be a major source of bias. Occupation was measured in broad categories, and only some specific occupations could be investigated due to small subsample sizes. Relatedly, the number of infections within a given pandemic phase was small for some frontline subsamples. Overall estimates of risk by occupational sector may be driven by particularly risky roles with considerable exposure [9], and further investigation into specific occupations is recommended. Additionally, inclusion of multiple test types to indicate SARS-CoV-2 positivity allowed for potential detection of asymptomatic or previously untested cases through serology, and detection of early cases through linkage. However, issues impacting the uptake and usage of each test type, including differential access to some tests in given phases of the pandemic, self-selection bias, and compliance with testing instructions may have affected estimates and are difficult to delineate. Notably, swab testing uptake may be influenced by differential testing behaviour between occupations. For example, health care workers undertake regular occupational testing which may lead to an overestimation of their relative risk of infection. However, a sensitivity analysis constrained to those

Conclusions
Despite these limitations, the present study indicates differential infection risk across occupational groups in England and Wales, with patterns of differential risk appearing to vary across pandemic waves. These findings illustrate the importance of work as a source of infection risk during the COVID-19 pandemic, with substantial fractions of infections attributable to occupation in at-risk groups. Occupations with persistently elevated risk (i.e., teachers) should be an ongoing target for interventions such as improved ventilation in schools, while understanding processes that shape differential risk in earlier phases of the pandemic is relevant for future outbreaks of respiratory infections. Investigation into the mechanisms underlying differential risk overall and over time, as suggested by this study, could inform evidence-based public health interventions in the workplace.